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1. Introduction 

The Density Matrix Renormalization Group (DMRG) method is a successful technique 
for simulating large low-dimensional quantum mechanical systems Developed 
for computing ground states of ID Hamiltonians, it is equivalent to a variational 
ansatz known as Matrix Product States (MPS) [2 El- This relation has been recently 
exploited to develop a much wider family of algorithms for simulating quantum 
systems, including time evolution [5], finite temperature |3] and excitation 
spectra [7]. Some of these algorithms have been translated back to the DMRG 
language [5[ E3 using optimizations developed in that field and introducing other 
techniques such as Runge-Kutta or Lanczos approximations of the evolution operator 

dH 112 M IH 

The MPS are a hierarchy of variational spaces, Sd, [See Eq. Q] sorted by the size 
of its matrices, D. MPS can efficiently represent many-body states of ID systems, even 
when the Hilbert space is so big that the coefficients of a pure state on an arbitrary 
basis cannot be stored in any computer. While the accuracy of this representation 
has been proven for ground states evolution of an arbitrary state changes the 
entanglement among its parties, and a MPS description with moderate resources (small 
D) might stop to be feasible. 

We will take a pragmatic approach. First of all, most algorithms in this 
work can compute truncation errors so that the accuracy of simulations remains 
controlled. Second, we are interested in simulating physically small problems, such as 
the dynamics of atoms and molecules in optical lattices. For such problems small 
D are sufficient to get a qualitative and even quantitatively good description of 
the observables in our systems. As we will see below, the biggest problem is the 
accumulation of truncation errors and not always the potential accuracy of a given 
MPS space for representing our states. 

The outline of the paper is as follows. In Sect. |2 we briefly introduce MPS and 
review some of their properties. In Subsect. 12.21 we present the optimal projection 
onto a MPS space, which is the keystone of most evolution algorithms. In Sect.|2we 
introduce for completeness the DMRG algorithm, focusing on the concepts which are 
essential for time evolution. In particular, we concentrate on the difficulties faced when 
implementing DMRG simulations and how those techniques relate to MPS. In Sect.0] 
we review almost all recently developed simulation algorithms under the common 
formalism based on the optimal truncation operator. Additionally we introduce three 
new methods: two of them are based on Taylor and Pade expansions of the evolution 
operator while the other one uses an "Arnoldi" basis of MPS increase the accuracy. 
Sect.|5]is a detailed comparison of MPS and DMRG algorithms using spin— i models. 
Our study shows that all methods are strongly limited by truncation and rounding 
errors. However, among all techniques, MPS methods and in particular our Arnoldi 
method performs best for fixed resources, measured by the size of matrices D or 
size of basis in DMRG. In the last part of our paper, Sect. we present a real- 
world application of the Arnolid evolution algorithm, which is to study a model of 
hard-core bosonic atoms going through a Feschbach resonance. Current experiments 
1171 IT%] with such systems have focused on the number and stability of the formed 
molecules. In this work we focus on the ID many-body states and show that coherence 
is transferred from the atomic component to the molecular one, so that this procedure 
can be used to probe higher order correlations in the atomic cloud. Finally, in Sect.0 
we summarize our results and open lines of research. 
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2. Matrix Product States (MPS) 

In this first section we introduce the notion of Matrix Product State, together with 
some useful properties and an important operator, the projection onto a MPS space. 
This section is a brief review of the concepts found in Refs. [31 E] - 



2.1. Variational space 

The techniques in this work are designed for the study of one-dimensional or quasi- 
one-dimensional quantum mechanical lattice models. If N is the number of lattice 
components and d the number of degrees of freedom of each site, the Hilbert space 
of states will have a tensor product structure, TL — Tif N , where d = dim7ii are the 
degrees of freedom of a single site. We will consider two examples here: one with 
spin-1/2 particles, where d — 2 (Scct.[5j), and later on a study of atoms and molecules 
in an optical lattice where d — 25 (Sect.EJ. 

Given those prerequisites, the space of MPS of size D is defined as the set of 
states of the form 

So :={^n^*J |*i •••**>, A?eC D ** D *+*y (1) 

where s k = 1 . . . d labels the physical state of the fc-th lattice site. The A k are complex 
matrices of dimensions that may change from site to site but are of bounded size, 
Dk-i x Dk < D 2 . Throughout the paper we will use different notation for the matrices 
{A^ fc }. An index k or / will always label the site they belong to and, whenever the 
expression is not ambiguous, the site index will be dropped: A Sk :— At h ■ The MPS 
components can also be regarded as tensors, A s ^g, the Greek indices denoting virtual 
dimensions a, /3 = 1 ... D. Finally, at some point we will rearrange all values forming 
a vector A\ := (A^,^,. . . , A d DD ) in a complex space C dxDxD . 

The first important property of the MPS is that they do not form a vector space. 
In general, a linear combination of M states with size D requires matrices of size MD 
to be represented. It is easy to do a constructive proof of the previous fact, but the 
reader may convince himself with a simple example, made of the two product states 
\0)® N and 1)®^, which live in Si, and the GHZ state |0) 8Ar + 11}®^ £ S 2 - 

The previous remark leads us to another property, which is that the dimension 
D required to represent a state faithfully^ is related to the amount of entanglement in 
the system. More precisely, that dimension is equal to the maximum Schmidt number 
of the state with respect to any bipartition and thus an entanglement monotone |19| . 
Indeed, creating entanglement forces us to use bigger and bigger MPS and this is the 
reason why some problems cannot be simulated efficiently using MPS. 

The third important property is that we can efficiently compute scalar products, 
distances and in general expectation values of the form ® 02 <8> • • • <8> Ol\4>), 

where if>,(f) S So and {Oi}f =1 are local operators acting on the individiual qubits or 
components of our tensor product Hilbert space. For instance, given that we know 
the matrices of those states, {A^ fc } for ip and {B^ 1 "} for (f>, the previous expectation 
value is made of a product of N matrices, 

" L 

l[E(O k ,A k ,B k ) , (2) 

Jfe=i 



(1>\®Li Ok\4>) =Tr 



\ By this we mean with absolutely no error. 
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where the "transfer matrices" are defined as follows 



E(O k ,A k ,B k ) :=^(^ fc )* ®£^< Sfc |O fc | Sfe ). 



(3) 



Since all usual Hamiltonians and correlators can be decomposed as sum of products 
of local terms, the previous formulas are very useful. An important remark is that 
when computing Eq. © one should not directly multiply the matrices E, but cleverly 
contract the A and B tensors alternatively, so as to achieve a performance 0(dD 3 ). 

The last property is that expectation values, distances \\ip — ^>|| 2 , fidelities 
and norms ||'0|| 2 j are quadratic forms with respect to each of the matrices in the MPS. 
Regarding the matrices of the state as elements of a complex vector, we can rewrite 
Eq. (QJ for the fc-th site as 

= A\QB k , (4) 



O k 



where the quadratic form Q is built as follows 



Q[saa'][s'l30'] '■— {s\O k \s') 



n 



={As+l...i,l...fc-l} 



,(5) 



a/3,a'/3' 



where [sa(3] denotes grouping of indices consistent with the ordering of the elements 
of A and B. This formula is used on all algorithms, from the computation of ground 
states jHJ to the time evolution 0. An important optimization is to avoid computing 
the full matrix Q, but to use the structure in © together with the sparsity of the 
transfer matrices. 



2.2. Projector operator 

Even if the MPS do not form a vector space, they are embedded in a bigger Hilbert 
space and provided with a distance. It is therefore feasible, given an arbitrary state 
vector, to ask for the best approximation in terms of MPS of a fixed size. The optimal 
projection onto Sd is a highly nonlinear operation and it will be denoted in this work 
by Vd- Following Ref. [5] 

2 



P D ^2c k \<j> {k) ) := argmin 



k 



(6) 



If we rather want to approximate the action of an operator that can be decomposed 
as U = X^ 1 Y 1 we will apply a generalization of the correction vector method PP 

Vd {X- l Y\ct))) := argmin||A|^) - Y\<j>)\\. (7) 

This formula is simple to apply and behaves well for a singular operator X. Its use 
will become evident when studying time evolution algorithms in Sect.0] 

One may quickly devise a procedure for computing the minimum of Eq. J7J) based 
on the definition of distance: 

\\x\ii>) - rm 2 = &\x j xW - m^Y\cj>) + {4>\YiY\<j>). (8) 

All scalar products in Eq. © are quadratic forms with respect to each of the matrices 
in the states \<p) and \ip). The distance is minimized by optimizing these quadratic 
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Figure 1. (a) DMRG view of a state, with the basis for the left \a k —\) and 
right block |id!fc_|_x) , and the states of the central spins, \s k ),\ s k+l)- (b) On a 
renormalization step, one spin is incorporated to the left block and a new basis is 



built, \a k ) 



forms site by site, or two sites at a time§, sweeping over all lattice sites until 
convergence to a small value which will be the truncation error 0. 

In short the algorithm looks as follows: (i) Compute some initial guess for the 
matrices of the optimized state if), (ii) Focus on the site k = 1. (iii) Use Eq. (QJ-© to 
find out the quadratic form associated to Eq. JSJ) for the first matrix 



e := AlQ x1x A k - 2KA\Q x , Y B k + B\Q Y ^ Y B k 

(iv) The stationary points of the error are given by equation 

Qx<xA k = Q x , Y B k . (10) 

Solve this equation and use the outcome as the new value of A k . (v) Estimate the error 
using Eq. (|5J. If e is small enough or does not improve significantly, stop. Otherwise 
move to another site, k = k ± 1, and continue on step (iii). 



(9) 



3. MPS and Density Matrix Renormalization Group (DMRG) 

Even though the numerical techniques for dealing with MPS seem very different from 
those of DMRG pQ, both methods are intimately connected. First of all, the DMRG 
produces MPS at each state of its algorithms. Second, particular algorithms such 
as the search for ground states in open boundary condition problems are equivalent 
in DMRG and MPS Third, other concepts, such as basis adaptation and state 
transformation from DMRG are analogous to the MPS ones, even though they are 
less powerful and less accurate. In this section we will elaborate on these statements. 



3.1. DMRG builds matrix product states 

The DMRG algorithms are based on the key idea that interesting states can be 
expressed using a basis with a small number of vectors. Take for instance the chain 
in Fig. Any state of this chain can be decomposed in the form 

IV') = ip{a k -is k s k+1 a k +i)\a k -i)\s k )\s k+1 )\a k+1 ), (11) 

where we sum over repeated indices. While for describing individual spins we use all 
possible states, s k , s k+ i = 1 ... d, in DMRG the states of the left and right blocks are 
expressed in a finite basis, a kl (3 k — 1 . . .M, where M, the number of states kept, is 
the basic control parameter. DMRG algorithms build those basis states recursively, 



§ The two-sites alternative, borrowed from DMRG, has the advantage of not getting trapped in 
subspaces of conserved quantities that conmute with both X and Y. 
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by taking a smaller block, adding a site and truncating the basis of the bigger block 
"optimally" in a way ot be precised later. Thus we have the relations 

K) :=^ fc _ lQ >fc-i)|sfe-i}, (12«) 
\Pk) :=A s ^ k+i \s k+1 )\p k+1 ). (126) 

It is trivial to see, substituting those equations into Eq. that all DMRG states 
are matrix product states 01 • 

3. 2. Targetting 

An important question in DMRG is how many states we have to keep for the left and 
right blocks and how to optimize them. The criterium is that the basis describing 
those blocks has to represent accurately a family of target states, \4> n )- The algorithm 
consists on a series of sweeps over the lattice ^] EH QH 120] with a recipe to achieve 
an optimal representation. For instance, let us say that we have an approximate basis 
around sites k and k + 1 and we will improve this sweeping from left to right, to k + 1 
and k + 2 [Fig. ^ . The first step is to build the weighted density matrix of the left 
piece of the chain 

PL ■= ^2 w n (j) n (ak-iSkSk+iPk+i)<p7i(ci' k ^ 1 s' k s k+1 p k+1 )*\a k ^ 1 s k )(a' k _ 1 s' k \ 

n 

= ^2w n Tr Sk+iak+1 \4> n )((j) n \, (13) 
n 

with some normalized weights, ^2 n w n = 1. Second, take the M most significant 
eigenvectors of this matrix 

p\a k ) = A fc K), X k > Xj Vk > I; k,l = 1...M X d. (14) 

These vectors become the improved new basis for the enlarged left block and are related 
by a transformation matrix to the basis elements of the smaller block H12af) . Matrix 
elements of observables and of the initial and target states have to be recomputed 
using this isometry and one continues until the end of the lattice. A similar procedure 
is employed to create the vectors \(3k) recursively from right-to- left. Multiple sweeps 
can be performed this way. 

For static problems one uses as target states, \<j> n ), the ground state and a 
number of excitations, computed with the Hamiltonian in the truncated basis. In 
some time evolution algorithms [111 1121 1131 1141 l2H] the targetting is done with 
respect to approximations of the time evolved state \ip(t)). In this work we have 
used \4>k) '■= \ip(kAt/ (N v — 1))), where the intermediate and final states, \ip(r)) are 
computed using the Hamiltonian on the truncated basis [See Sect. I4.2| . 

3.3. Targetting vs. projection 

There are a number of complicated subtleties and facts that are rarely mentioned in 
the DMRG literature. The first one is that in most implementations of the targetting 
algorithm, the state itself is updated and rewritten in the new basis after each step 

|V>> -> ^(ak-iSkSk + if3k+i)A^ iak \a k )\sk}\sk+i)\Pk+2)- (15) 

However, this leads to wrong answers because the initial state ^(0) deteriorates during 
the algorithm, and it only works when the basis of the left and right block are already 
large enough. A more accurate procedure consists on keeping the initial state in the 
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Figure 2. Error in the DMRG Runge-Kutta algorithm [Sec Sect. 14.21 . using 
as initial state \ip) = \0)® N and Hamiltonians H = J^i^f (solid, dashed) or 
H — Y\i (dotted). The algorithm used either a single basis (dashed) or two set 
of basis states (solid, dotted) for targetting. The dashed line shows errors due to 
using a single basis, the dotted line fails because the Hamiltonian does not have 
nonzero elements on the initial DMRG basis. 

original basis, and on each step of the the algorithm compute its matrix elements in 
the new basis. As an example, in Fig. [21 we plot the error of an evolution algorithm 
with the trivial algorithm and with the correct one, for an initial product state |1)® L 
in the of basis and using a simple Hamiltonian H = J2i a f- I n both cases we begin 
with a basis of size M = 1 vectors per block, letting the basis grow up to 2 L / 2 . However 
only the second method is capable of doing the update. 

An even more important issue is that if in current literature this method is 
formulated in an intuitive way and little is known about why it works and how accurate 
it is. To this respect, a close look at the matrices (I12af) - (|12&|) created by the targetting 
with multiple vectors, |</> t ), shows that their procedure is quite similar to the projection 
operator for some linear combination VmiYlk Cfcl^fc}), but the steps for computing the 
optimal approximation are wrong if there are more than one target state. 

Another consequence of being tied to the notion of "optimal basis" instead of 
treating MPS as a variational family of wavefunctions, is that the targetting procedure 
does not work when the operators O n and the target states \<f> n ) have no elements in 
the initial basis. A trivial example consits on the same product state \ip(0)) = \1)® N as 
before, beginning with M = 1 states per block and evolved with H — JT i erf. As shown 
in Fig. |3 a conventional DMRG sweep cannot enlarge the basis in the appropiate way 
and the method fails to follow the evolution of the state. A MPS algorithm, on the 
other hand, even if it begins with a state with D = 1, grows up the state up to the 
maximal size D — 2 and makes absolutely no error in the simulation. 

The final practical difference is that in current DMRG works ^2 Id IHI the 
powers of the Hamiltonian acting on the original state, H n \4>), are approximated by the 
truncation of H to the current basis. This is another potential source of errors which 
we cleverly avoid in our Arnoldi and Runge-Kutta methods shown below [Sect. 14.31 
and !4.5j and its effect are be evident in some of the simulations [Sect.EJ- 

The previous paragraphs can be rephrased as follows: DMRG algorithms need 
not only an initial input state, but also a suitable and large enough basis for doing the 
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time evolution. How to construct this basis is largely heuristics, and contrasts with 
the systematic way in which MPS work. 



4. Time evolution 



Since Sd is not a vector space, we cannot solve a Schrodinger equation directly on it. 
Our goal is rather to approximate the evolution at short times by a formula like 

m+At))~r D {u n (At)m))}. (i6) 

Here, Vd is the optimal projection operator defined before and U n (At) = 
exp(—iHAt) + 0(At n ) is itself an approximation of n-th order to the evolution 
operator. Even though this formulation applies qualitatively to all recently developed 
MPS and DMRG algorithms, there are subtle differences that make some methods 
more accurate than other. The actual implementation of time evolution is thus the 
topic of the following subsection. 



4--1- Trotter decomposition 

While there had been previous works on simulating time evolution using DMRG 
algorithms |211 1221 123) . an important breakthrough happened with the techniques 
in Ref. which was later on translated to the DMRG language 003 El and have 
since been applied to a number of interesting problems [2]EH1EI>1EZ!- Vidal's seminal 
paper suggested doing the time evolution with a mixture of two-qubit quantum gates 
and truncation operations. The idea is to split a Hamiltonian with nearest-neighbor 
interactions into terms acting on even and odd lattice edges 

L/2 L/2 

H = J2H 2k + j2H 2k _ 1 =:H E + H . (17) 
fe=i fe=i 

This leads to a Suzuki- Trotter decomposition of the time evolution operator into a 
sequence of unitaries acting on even and odd lattice bonds: 

L/2 L/2 

e -iHAt ^ e -iHvAt e -iH At _ j~J £ -iH 2k At j~J e ~iH 2k + 1 At ^g-j 

k=l k=l 

Inserting optimal truncations in between the applications of these unitaries, Uj := 
exp(—iHjAt), Vidal's algorithm can be recasted in the form of Eq. (|TT)|l 

L/2 L/2 

\ip(t + At)) :=Y[V D U 2k i[r D U 2k+1 \xP(t)). (19) 

fe=i fc=i 

Applying each of the unitaries Uj is a relatively costless task, and in particular, for 
open boundary condition problems, the combination VdU j can be done in a couple 
of steps which amount to contracting neighboring matrices and performing a singular 
value decomposition 0] |SJ |U] ■ 

In Ref. we developed a variant of this method which does not insert so many 
truncation operators, but waits until the end 

L/2 L/2 

m + At)) :=P D l[U 2k i[U 2k+1 \ip(t)). (20) 

k=l k=l 
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This procedure has a similar computational cost, 0(ND 3 ), but the solution is then 
expected to be optimal for a given Trotter decomposition and can be generalized to 
problems with periodic boundary conditions. 

The accuracy of either method may be increased by an order of magnitude using 
a different second order decomposition 

e -iHAt _ e -iH E At/2 e -iH At e -iH E At/2 + Ql/^\ (21) 

but it is better to apply a Forest-Ruth formula [25] 

-iHAt _ -iH E 9At/2-iHo9At-iH E {l-9)At/2-iHo{l-29)At v 

xe -iH E (l-8)At/2 e ~iHo9At e -iH E 8At/2 (22) 

with the constant 8 = 1/(2 — 2 1 / 3 ) and which has a Trotter error of order 0(At 5 ). 
Note that better Suzuki- Trotter formulas can be designed but they do not provide a 
big improvement in the number of exponentials or accuracy EB1 . 

4-2. Runge-Kutta and Lanczos with DMRG 

After the development of the Trotter methods, in Ref. JI] we find a new algorithm in 
the field of DMRG. The idea now is to use not a Trotter formula, but a Runge-Kutta 
iteration: 

|fci) := AtH(t)\ip(0)}, (23a) 
\k 2 ) :=AtH(t)[|^(0)) + ||Ai>], (236) 
|* 3 > :=AiH(t)[^(0)) + ||fe>], (23c) 
\k A ) :=AtH(t)[\4>(0)) + \k 3 )]. (23d) 
These vectors are then used to interpolate the state at other times || 

|^(At)) ~ |^,(0)) + I ^[31|fci) + 14(|fc a ) + |fc 8 ))-5|fc4>], (24a) 
|V(2^)} c |^(0)) + &[16|fci) + 20(1*2) + |* 3 )) - 2|fc 4 )], (246) 
|^(Ai)) ^|V>(0)) + i[|fc 1 ) + 2(|A :2 ) + |fc 3 )) + |fc4)]. (24c) 

These three vectors, together with \4>(0)), are used to find an optimal basis using the 
targetting procedure explained in Sect. 13.21 Once the basis is fixed, the time evolution 
is performed with the same formulas but smaller time step, At/10. There are variants 
of this technique which approximate the time evolved state using a Lanczos method 
|29l I5U| with the truncated matrix of the Hamiltonian in the DMRG basis. The 
different submcthods differ on whether the preparation of the basis is done only using 
the final and initial state ^2] or multiple intermediate time steps |12l 113) . 

4-3. Runge-Kutta like method with MPS 

Our implementation of a 4-th order method with Runge-Kutta like formulas uses 
several simplifications. First of all, since our Hamiltonian is constant, a Runge-Kutta 
expansion becomes equivalent to a fourth-order Taylor expansion of the exponential 

4 i 

|V(At))= £-(i#At)"|^(O)) + 0(At 5 ) 

TV. 

~ (iHAt - Zl )(iHAt - z\)(iHAt - z 2 )(iHAt - z%)\i/j(0)) 
=: Y^YsY^O)). (25) 

| Notice the typo in Eq. (4) in Ref. ITTI. 
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Here we have rewritten the fourth order polynomial in terms of its roots, Z\ and Z2- 
Using the fact that we know an efficient algorithm to compute Vu^% acting on a MPS, 
we can write 

4 

|V(Ai)> :=n^#(G)>, (26) 

i=l 

which is our MPS Runge-Kutta-like algorithm. There are multiple reasons to proceed 
this way. On the one hand, we do not want to approximate higher powers of the 
Hamiltonian using a truncated basis [See Sect. Yd. 2] . On the other hand, if we treat 
expand the Hamiltonian to all powers, there will be too many operators and the 
complexity of the state will increase enormously. The previous decomposition has 
proven to be a good compromise. 



4-4 ■ Po.de approximations 

Runge-Kutta and in general polynomial approximations to the exponential are not 
unitary. Indeed, if we look at the eigenvalues of the evolution operator in either 
Eq. (|24c|) or Eq. we will see that they are of the form \ n = £^ =0 (i-Eiv At) n /n\, 
where E n are the eigenvalues of the Hamiltonian H. From this equation we see that 
| A„ | ^ 1 and some eigenmodes may grow exponentially, which is another way to say 
that Runge-Kutta algorithms are numerically unstable. 

There exist multiple implicit methods that eliminate the lack of unitarity and 
produce stable approximations. They receive the name "implicit" because the value 
of the state at a later time step At is obtained by solving an equation or inverting an 
operator. We will focus on Pade approximations to the exponential 

U " {M) = &- (27) 

which are computed with same order polynomials in the numerator and denominator 
|31| . The lowest order method is known as the Crank-Nicholson scheme, it arises from 
a second order discretization of the Schrodinger equation and has the well known form 

iWAt) = YT j W ^j- 2 . (28) 

It is easy to verify that the eigenvalues of this operator are just phases and that the 
total energy is a conserved quantity. The other method that we have used and which 
we compare in this work is a fourth order expansion 

1 - iAtH/2 - (AtH) 2 /12 
l + iAtH/2- (AtH) 2 /12' 

Applying either t/cN2 or C/cn4 on a matrix product state is equivalent to solving 
the problem in Eq. l(7|. where the operators X and Y are the denominator and the 
numerator in the previous quotients. 



ucMAt) = ^ 2 ;: . ( 29) 



4-5. Arnoldi method 

The last and most important method that we present in this paper combines many of 
the ideas explained before. First of all, we will use the fact that a linear combination 
of MPS, such as J2k=i c k\4>D^)^ resides in a space of bigger matrix product states, 
Sn„d, and it is thus a more accurate representation of the evolved state than a 
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single vector of size D. In the language of DMRG, N v vectors each of size D 
provide us with an effective basis of size N V D. This optimistic estimate is only 
possible when the vectors are indeed linearly independent. Our choice for an optimal 
decomposition will be therefore a Gramm-Schmidt orthogonalization of the Krylov 
subspace, {ip, Hip, H 2 ip . . .}, performed using MPS 

^^(^-g*©^)' (30) 

with initial condition \<fio) := \ip{0)). Defining the matrices Nik '■= {'Pjl'Pi) an d 
Hik :— (<f>j\H\<f>i) we compute an Arnoldi estimate of the exponential 

|V(Ai)> := V D ^[e- iAtJV " lH ] fc o|^ fc ). (31) 

k 

This algorithm involves several types of errors all of which can be controlled. First, 
the error due to using only N v basis vectors is proportional the norm of the vector 0jv„ 
as in ordinary Lanczos algorithms |29l I30| . Truncation errors arising from Vd can be 
also computed during the numerical simulation. Out of this errors, in our experience, 
the final truncation in Eq. (|31|l is the most critical one, since the other ones may be 
compesanted by adding more and more vectors. 

Finally, for completeness, in this work we have also implemented a Lanczos 
method. It differs from the previous one in that the basis is built orthogonalizing 
only with respect to the two previous vectors, so that Eq. (|30[) contains only three 
summands, as in ordinary Lanczos iterations. One then assumes that, due to the 
Hermiticity of the Hamiltonian, orthogonalitiy to the rest of the Krylov basis is 
preserved. Furthermore, if this is true, the effective matrices for H and iV are 
tridiagonal and can be constructed with a simple recurrence. This method has a 
potential gain of 0{oo/M\z) in speed due to the simplifications in Eq. (|30|) . However, 
as we will see later, truncation errors spoil the orthogonality of the Lanczos vectors 
and makes the method useless for small matrices. 



5. Comparison 

We tested all algorithms by simulating the evolution of the same state under a family 
of spin-i Hamiltonians with nearest-neighbor interactions 

H = J2 M + s t s t+i + ^44+i) + ■ (32) 

k 

As initial state we take the product | -0(0)) oc (|0) + |1})® L , where |0) and |1) are 
the eigenstates of s z . By restricting ourselves to "small" problems (L < 20), we can 
compare all algorithms with accurate solutions based on exact diagonalizations and the 
Lanczos algorithm |29U30) . Notice that we measure the error in the full wavefunction, 
£ := \\iPd{T) — U(T)ip(0)\\ 2 and not on the expectation values of simple correlators 
whose exact evolution is known |14j . The outcome of some of the simulations is in 
Figs. El and which we will discuss in the following paragraphs. 

The first set of simulations was done for 8 spins using matrices of size D = 16 
and a DMRG basis of similar size. Since Sd contains all possible states, Vd = L 
we expect no truncation errors in any of the algorithms. As shown in Fig. [3] for the 
XY model with 6 — 0.35 and A = 0, for medium to long time steps most errors 
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Figure 3. Error, e, vs. time step, At, for simulations of model 1321 with 8 spins, 
= 0.35, A = 0, D = 16 and T = 10. We compare (a) Trotter methods, (b) MPS 
algorithms and (c) DMRG algorithms. As a reference, all plots contain the error 
of the MPS Arnoldi and Lanczos methods (solid black line). 



show the expected behavior. Thus, the errors of the Trotter methods of second and 
fourth order follow the laws 0(At 2 ) and 0(At A ). Runge-Kutta, Taylor and Pade 
approximations have an error of 0(At 2 ), and for the Arnoldi and Lanczos methods 
with N v — 4 and 8 vectors we have a qualitative behavior 0(At Nv ^ 1 ). Since the 
size of the matrices and of the DMRG basis is very big, all these laws are followed, 
irrespective of whether the implementation uses DMRG or MPS. The only exception 
seems to be the DMRG Lanczos when implemented with only two target states. This 
method behaves more poorly than the counterpart with N v target states given by 
\</> k ) := \i>(kAt/(N v - 1)}, k = 0..N V - 1. 

Out of the methods that work as expected, all of them break the ideal laws at some 
point, acquiring an error of order 0(At~ 2 ). This error is exponential in the number 
of steps T / At and it signals the finite accuracy of the optimization algorithms, due 
to the limited precision of the computer. Roughly, since current computers cannot 
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Figure 4. Wavcfunction error vs. time step, At, for simulations of model 1321 
with 16 spins, 6 = 0.35, A = 0, and T = 10. In Fig. (a)-(d) we plot the outcome 
for different MPS sizes or DMRG basis, denoted by D. The association between 
methods and line types is that of Fig. O 

compute the norms of vectors, H^H 2 , with a relative error better than 10 -16 , a worst 
case estimate is e = (T/At) 2 10~ 16 , which perfectly fits these lines. 

Theoretically, the performance measured in computation time and memory use 
of all algorithms is of order (D(N v NhD 3 /N), where Njj is related to the number of 
operators in {X, Y, X^X, Y^Y, . . .}, N is the size of the problems and the additional 
factor N v only appears for Arnoldi and Lanczos methods. For periodic boundary 
conditions the cost increases by 0(D 2 ). However, we have explicitely avoided PBC 
problems because DMRG cannot handle them so accurately. In practice, out of all 
methods, the Pade expansions are the slowest ones, while the Trotter formulas, being 
local, are the fastest. In between we find the Arnoldi and Lanczos methods, which are 
nevertheless competitive if we consider their accuracy and the fact that they allow for 
longer time steps. 
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Figure 5. Same as in Fig. 151 but for N = 20 spins. 

In the remaining simulations we dropped the methods from Sect. 14.31 and 14.41 

because they have a similar computational cost and worse performance than the MPS 
Arnoldi method. We keep, on the other hand, all Trotter and DMRG methods and 
continue our study with bigger problems in which the MPS spaces and the DMRG 
basis are smaller than the limit required to represent all states accurately, D = d N / 2 
and M = d N /' 2 ~ x . More precisely, we choose N = 16 and 20 spins and try with 
M, D = 32, 64, 80 and 128. 

Now that the methods are potentially inexact, our previous error laws are modified 
by the introduction of truncation errors. The first thing we notice in Fig. 0Ji-c is 
that the Trotter error of second order is rather stable, its error being the same 
for MPS and DMRG. However, when we go for higher orders and increase the 
number of exponentials, truncation errors affect more strongly the Vidal and DMRG 
implementations than the MPS one. This shows the difference between making local 
truncations after each bond unitary (DMRG and Vidal) versus delaying truncations 
until the end and using the optimal projection 
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Another important conclusion is that all DMRG methods are very sensitive to 
truncation errors. As shown in Figs.0Ji-c, there is not a very big difference in accuracy 
between the DMRG Runge-Kutta and the Lanczos implementation, and both methods 
are not more accurate than the DMRG Forest-Ruth formula. The main reason why 
higher order methods from Sect. l4.2l do not improve the results is due to estimating the 
evolution with the matrix of the Hamiltonian on the truncated DMRG basis. Another 
reason is that under truncation, the Lanczos recurrence does no longer produce an 
orthogonal set of vectors. 

To prove those statements we compared with a Lanczos method implemented 
with MPS as explained in Sect. 14.51 This method does indeed have a better accuracy 
than the DMRG ones, which can be attributed to the use of the full Hamiltonian. 
The errors of the Lanczos are however larger than those of the Arnoldi method for a 
similar D and we have checked that under truncation the vectors of the Lanczos basis 
are not truly orthogonal. These small errors accumulate for shorter time steps, and 
only the Arnoldi method can correct them. 

As for the Arnoldi method, it has the greatest accuracy and seems to be stable as 
D becomes small. For very small matrices the error remains constant as we decrease 
the time step, but this is only because there is a lower bound in the approximation 
error given by \\V D \ip(T)) ~ \i>( T ))\\ 2 - FigureEt shows how the errors in the Arnoldi 
method are correlated to the errors made when approximating the exact solution with 
a MPS of fixed size. This plot illustrates the fact that all errors in this method are 
due to the final truncation. 

Summing up, one should use the method that allows for the longest time steps and 
the least number of truncations (or applications of Vd) and mathematical operations. 
All methods have an optimal time step which is a compromise between the errors in 
U n and the rounding and truncation errors made on each step. Regarding performance 
and accuracy, the two winning methods are MPS algorithms using either the fourth 
order Forest-Ruth decomposition or the Arnoldi basis. The last method however, has 
two advantages. One is that it can deal with nonlocal interactions and the second one 
is its potential for parallelizability, roughly 0(N v Nh / L), which all other presented 
algorithms lack. This can make it competitive with, for instance, increasing the size 
of the matrices in the Forest-Ruth method. Regarding DMRG methods, we find that 
they give comparable results only for big basis. When truncation errors pop in, their 
behavior is less predictable and it does not seem worth going with more elaborate 
algorithms (Lanczos, Runge-Kutta) vs. an ordinary Trotter formula. 

The fact that previous results are model-independent has been confirmed by a 
systematic scanning of all possible Hamiltonians in Eq. (|32|l . A selection is shown 
in Fig. The Arnoldi method is shown to be accurate, even for gapless problems. 
When the Arnoldi method fails it is due to truncation errors. In those situations the 
evolved state cannot be accurately represented by MPS (and by that matter also not 
by DMRG), simply because ||(1 — 'Pd)|V'(^))I| 2 i s finite and large [Fig. 01]. Increasing 
the number of Arnoldi vectors will also not help (See Fig. |HJd) for the same reason. 

6. Simulation of Feschbach resonances 

As a real-world application we have used the MPS Trotter and Arnoldi methods to 
simulate the conversion of bosonic atoms into molecules, when confined in an optical 
lattices and moving through a Feschbach resonance El EU ■ The goal is to study 
how correlation properties are transferred from the atoms into the molecules and how 
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Figure 6. Simulations of L = 16 spins with 8 = 0.35, A = 0. (a) Minimal 
truncation error e = ||(1 — V&4, exp(— iHt)\^(0))\\ 2 of the time evolved state and 
accumulated error made when simulating this Schrodinger equation using the 
Arnoldi method with 8 vectors, D = 64 and At = 0.16 (dashed), (b) Similar as 
before, errors in the Arnoldi method for varying number of vectors, T = 10 and 
At = 0.04, 0.16, 0.32 (bottom to top). 




Figure 7. Errors for various other spin s = 1/2 models as parametrized in 
Eq. 1321 . We have used a second order Trotter method (circles) and an Arnoldi 
method (solid), with 16 spins, T = 5 and D = 32. 



this dynamics is affected by atom motion and conversion efficliency. 

The effective model combines the soft-core Bose-Hubbard model used to describe 
the Tonks gas experiments with a coupling to a molecular state [331 

H=-J a \a a 3<y + ^Y- a L a L a i<r^° ( 33 ) 

+ Yl {( Em + U ™ n i a) ) n i m) + n[&JoiTOU + H - c -]} • 

i 

Here, a»f, and bi are bosonic operators for atoms in two internal states and the 
molecule; and nf 1 ^ are the total number of atoms and of molecules on each 
site, and we have the usual two-level coupling with Rabi frequency fl and detuning 
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Figure 8. Dynamics of a site with two atoms when the energy of the molecules 
is ramped linearly: E m = fti + 4fi(l — 2t/T). We plot (a) the instantaneous 
energy levels (solid), and (b) the fraction of atoms converted into molecules. 



A := E m — U m . For simplicity, we will assume that atoms and molecules interact 
strongly among themselves (Z7f,T> Ul,i, U m — > oo), so that we can treat them as hard- 
core, a\ a ,b1,ai a bi = 0. Also since molecules are heavier, we have neglected their 
tunneling amplitude, although that could be easily included. 

As the energy of the molecular state is shifted from E m ^> U m down to E m <C U m , 
the ground state of Eq. (|33[) changes from a pair of coupled of Tonks gases, to a purely 
molecular insulator. We want to study the dynamics of this crossover as E m is ramped 
slowly from one phase to the other. 

The simplest situation corresponds to no hopping: isolated atoms experience 
no dynamics, while sites with two atoms may produce a molecule. The molecular 
and atomic correlations at the end of the process are directly related to two-body 
correlations in the initial state |34l I35| . 

(mlm k )t=T ~ (nk^n kl ) t=0 , (34) 

{a\a k )t=T ~ (<4<Xfc} i= o - (nkin kl ) t=0 . (35) 

Therefore, this process can thus be used as a tool to probe quantum correlations 
between atoms. Studying the two-level system {o^a^|0), &||0}}, we conclude that 
for this process to work with a 90% efficiency, the ramping time should be larger than 
T ~ 1.5/0 [See Fig.^b)]. 

We have simulated numerically the ramping of small lattices, L = 10 to 32 
sites, with an initial number of atoms N^i — L/2,3L/4. The value of the molecular 
coupling has been fixed to O = 1 and the interaction has been ramped according to 
E m = f/fj, + 40(1 — 2t/T) using the ideal ramp time T = 1.5/0. We have used two 
particular values of the inter-species interaction, Uf i/O = 0, 2, and scanned different 
values of the hopping J/O E [0, 0.4]. The initial condition was always the ground state 
of the model with these values of Ufi/J and no coupling. These states contain the 
correlations that we want to measure. 

The main conclusions is that indeed the correlations of the molecules are almost 
those of the initial state of the atoms l)34[l. even for J = 0.4O when the process has 
not been adiabatic. An intuitive explanation is that hopping is strongly suppressed as 
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Figure 9. (a) Fraction of atoms converted into molecules vs. adimensionalized 
hopping amplitude, for f/f j = and f7jj = 1, from top to bottom. Circles 
show the outcome of the numerical experiment, while solid lines contain the ideal 
fraction 1511 , The case J = uses an initial condition J = 0.1Q and then switches 
off tunneling before ramping, (b) Correlations of the molecular state, (m\mj) t —T 
(circles) after the ramp, and those of the initial atomic state, (at, at-.Oj-f Oj'|)i = o 
(solid line). The plot for [7m = 5J has been shifted up by 0.2. 



we approach the resonance, due to the mixing between atomic states, which can hop, 
and molecular states, which are slower. We can say that the molecules thus pin the 
atoms and measure them. This explanation is supported by a perturbation analysis 
at J O, where one finds that a small molecular contamination slows the atoms on 
the lattice. This analysis breaks down, however, for J ~ Q, the regime in which the 
numerical simulations are required. 

7. Conclusions 

We have performed a rather exhaustive comparison of different methods for simulating 
the evolution of big, one-dimensional quantum systems 0] |SJ El El E3 EH El El EH 
with three other methods developed in this work. We find the MPS methods to be 
optimal both in accuracy and performance within the formulas of similar order. All 
procedures are substantially affected by truncation and rounding errors, and to fight 
the latter we must choose large integration time-steps. However, the only algorithm 
which succeeds for very large time-steps is an Arnoldi method developed in this work. 
Finally, this algorithm can be applied to problems with long range interactions. 

Using this algorithm, we have simulated the dynamics of cold atoms in a ID 
optical lattice when crossing a Feschbach resonance. The main conclusion is that with 
rather fast ramp times it is possible to map the correlations of the atomic cloud (two 
Tonks gases in this case) and use this as a measuring tool in current experiments. 
This result connects with similar theoretical predictions for fermions in Ref . (121 . 
Simple generalizations of this work will allow us in the future to analyze losses and 
creation of strongly correlated states with the help of the molecular component. 

As posible outlook, we envision the possibility of developing new algorithms in 
which the state is approximated by a linear combination of MPS at all times. This 
should be more efficient than increasing the size of the matrices, and could support 
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distributed computations in a cluster. 
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